
Comparative Methods for Gene Structure Prediction in Homologous Sequences

Author: A. Krogh
D. S. Hirschberg
I. Korf
J. Hein
M. Burset
O. Gotoh
S. Batzolou
S. Brunak
T. F. Smith
Publication venue: 'Aarhus University Library'
Publication date: 05/06/2002

The increasing number of sequenced genomes motivates the use of evolutionary patterns to detect genes. We present a series of comparative methods for gene finding in homologous prokaryotic or eukaryotic sequences. Based on a model of legal genes and a similarity measure between genes, we find the pair of legal genes of maximum similarity. We develop methods based on gene models and alignment-based similarity measures of increasing complexity, which take into account many details of real gene structures, e.g. the similarity of the proteins encoded by the exons. When using a similarity measure based on an existing alignment, the methods run in linear time. When integrating the alignment and prediction processes, which allows for more fine-grained similarity measures, the methods run in quadratic time. We evaluate the methods in a series of experiments on synthetic and real sequence data, which show that all methods are competitive but that taking the similarity of the encoded proteins into account substantially boosts performance.

Tidsskrift.dk (Det Kongelige Bibliotek)

Predicting Secondary Structures, Contact Numbers, and Residue-wise Contact Orders of Native Protein Structure from Amino Acid Sequence by Critical Random Networks

Author: Altschul S. F., Madden, T. L., Sch
Baldi P., Brunak, S., Frasconi, P.
CHANDONIA J-M
Crooks G. E.
Kinjo A. R.
Kinjo A. R., Horimoto, K.
Lee B.
Li W., Jaroszewski, L.
Nishikawa K.
Pollastri G., Baldi, P., Fariselli
Rost B.
TATENO Y
Publication venue: 'Biophysical Society of Japan'
Publication date: 01/01/2005

Prediction of one-dimensional protein structures such as secondary structures and contact numbers is useful for three-dimensional structure prediction and important for understanding the sequence-structure relationship. Here we present a new machine-learning method, critical random networks (CRNs), for predicting one-dimensional structures, and apply it, with position-specific scoring matrices, to the prediction of secondary structures (SS), contact numbers (CN), and residue-wise contact orders (RWCO). The present method achieves, on average, a Q_3 accuracy of 77.8% for SS, and correlation coefficients of 0.726 and 0.601 for CN and RWCO, respectively. The accuracy of the SS prediction is comparable to other state-of-the-art methods, and that of the CN prediction is a significant improvement over previous methods. We give a detailed formulation of the CRN-based prediction scheme, and examine the context-dependence of prediction accuracies. In order to study the nonlinear and multi-body effects, we compare the CRN-based method with a purely linear method based on position-specific scoring matrices. Although not superior to the CRN-based method, the surprisingly good accuracy achieved by the linear method highlights the difficulty in extracting structural features of higher order from amino acid sequence beyond that provided by the position-specific scoring matrices.

Comment: 20 pages, 1 figure, 5 tables; minor revision; accepted for publication in BIOPHYSIC
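The two kinds of score quoted in this abstract can be illustrated with a short sketch. It assumes the usual definitions: Q_3 as the fraction of residues whose three-state label (H, E, C) is predicted correctly, and the correlation coefficient as Pearson's r between predicted and observed per-residue values. The function names and the toy inputs are hypothetical and are not taken from the paper.

```python
from math import sqrt


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of positions where the three-state label (H, E, C) agrees."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def pearson(xs, ys) -> float:
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(q3_accuracy("HHHEC", "HHHCC"))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Scores such as the 77.8% and 0.726 reported above would typically be averages of these per-protein values over a test set; the exact averaging scheme is not stated in this listing.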

arXiv.org e-Print Archive

Organizing research data

Author: C Churcher
D Kleppner
EF Codd
International Nucleotide Sequence Database Collaboration
J Gray
National Academy of Sciences
Peter Sestoft
R Snodgrass
R Wilcke
S Brunak
S Lippert
Publication venue: BioMed Central
Publication date: 01/01/2011

Research relies on ever larger amounts of data from experiments, automated production equipment, questionnaires, time series such as weather records, and so on. A major task in science is to combine, process and analyse such data to obtain evidence of patterns and correlations.

Springer - Publisher Connector

The IT University of Copenhagen's Repository

Cyclebase.org: version 2.0, an updated comprehensive, multi-species repository of cell cycle experiments and derived analysis results

Author: Cho
Diella
Gauthier
Jensen
Lars Juhl Jensen
Lichtenberg
Loog
Menges
Mukherji
Muller
Nicholas Paul Gauthier
Niu
Oliva
Pafilis
Peng
Pramila
Rasmus Wernersson
Rustici
Sopko
Spellman
Søren Brunak
Thomas S. Jensen
Ubersax
Whitfield
Publication venue: 'Oxford University Press (OUP)'
Publication date: 24/11/2009

Cell division involves a complex series of events orchestrated by thousands of molecules. To study this process, researchers have employed mRNA expression profiling of synchronously growing cell cultures progressing through the cell cycle. These experiments, which have been carried out in several organisms, are not easy to access, combine and evaluate. Complicating factors include variation in interdivision time between experiments and differences in relative duration of each cell-cycle phase across organisms. To address these problems, we created Cyclebase, an online resource of cell-cycle-related experiments. This database provides an easy-to-use web interface that facilitates visualization and download of genome-wide cell-cycle data and analysis results. Data from different experiments are normalized to a common timescale and are complemented with key cell-cycle information and derived analysis results. In Cyclebase version 2.0, we have updated the entire database to reflect changes to genome annotations, and included information on cyclin-dependent kinase (CDK) substrates, predicted degradation signals and loss-of-function phenotypes from genome-wide screens. The web interface has been improved and provides a single, gene-centric graph summarizing the available cell-cycle experiments. Finally, key information and links to orthologous and paralogous genes are now included to further facilitate comparison of cell-cycle regulation across species. Cyclebase version 2.0 is available at http://www.cyclebase.org

Copenhagen University Research Information System

Sequence-based feature prediction and annotation of proteins

Author: Bernsel Andreas
Bork Peer
Brunak Søren
Casadio Rita
Jensen Lars J
Juncker Agnieszka S
Ouzounis Christos A
Pierleoni Andrea
Tress Michael L
Valencia Alfonso
von Heijne Gunnar
Publication venue: BioMed Central
Publication date: 01/01/2009

The combination of prediction tools in complex workflows and pipelines facilitates prediction of protein features from sequence.

CiteSeerX

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

MDC Repository

Public Library of Science (PLOS)

Using Electronic Patient Records to Discover Disease Correlations and Stratify Patient Cohorts

Author: Andreatta Massimo
Bredkjær Søren
Brunak Søren
Dalgaard Marlene
Hansen Thomas
Jensen Lars J
Jensen Peter Bjødstrup
Juul Anders
Roque Francisco S
Schmock Henriette
Søeby Karen
Werge Thomas
Publication venue: Public Library of Science
Publication date: 01/01/2011

Electronic patient records remain a rather unexplored, but potentially rich, data source for discovering correlations between diseases. We describe a general approach for gathering phenotypic descriptions of patients from medical records in a systematic and non-cohort-dependent manner. By extracting phenotype information from the free text in such records, we demonstrate that we can extend the information contained in the structured record data, and use it for producing fine-grained patient stratification and disease co-occurrence statistics. The approach uses a dictionary based on the International Classification of Disease ontology and is therefore in principle language independent. As a use case we show how records from a Danish psychiatric hospital lead to the identification of disease correlations, which subsequently can be mapped to systems biology frameworks.
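As a rough illustration of what disease co-occurrence statistics can look like once each patient has been mapped to a set of diagnosis codes, the sketch below counts how often pairs of codes appear in the same patient and compares that with the count expected if the codes were independent. The abstract does not say which statistic the authors use, so the observed/expected ratio, the function names, and the toy patient data are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(patient_codes):
    """Count patients per code and patients per unordered code pair."""
    single = Counter()
    pairs = Counter()
    for codes in patient_codes:
        unique = sorted(set(codes))  # each code counts once per patient
        single.update(unique)
        pairs.update(combinations(unique, 2))  # pairs are alphabetically ordered
    return single, pairs


def observed_over_expected(pair, single, pairs, n_patients):
    """Ratio of observed pair count to the count expected under independence."""
    a, b = pair
    expected = single[a] * single[b] / n_patients
    if expected == 0:
        raise ValueError("both codes must occur in at least one patient")
    return pairs[pair] / expected


if __name__ == "__main__":
    # Toy patients, invented for illustration; codes are written in ICD-10 style.
    patients = [
        ["F20", "E11"],
        ["F20", "E11", "I10"],
        ["I10"],
        ["F20"],
    ]
    single, pairs = cooccurrence_counts(patients)
    ratio = observed_over_expected(("E11", "F20"), single, pairs, len(patients))
    print(pairs[("E11", "F20")], round(ratio, 3))
```

A ratio above 1 means the pair occurs together more often than independence would predict. A real analysis would also need a significance test and correction for multiple comparisons, which this sketch leaves out.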

CiteSeerX

Directory of Open Access Journals

Copenhagen University Research Information System

University of Southern Denmark Research Output